35 research outputs found

    Risk scores and their predictive ability for complex diseases

    The electronic version of the thesis does not include the publications. Falling costs of genotyping and sequencing technologies have caused an explosive growth in the amount of genetic data, making it possible to study the genetic background of many traits and diseases by combining these data with existing phenotypic data. The most studied sources of genetic variation are single nucleotide polymorphisms (SNPs). The effects of common SNPs on traits are mostly quite small, so each SNP on its own has little predictive ability. By contrast, the variable obtained by combining the effects of many SNPs, called a genetic risk score, has proved very valuable for predicting several complex diseases. The thesis introduces the doubly-weighting method, which includes many weakly correlated SNPs from an existing genome-wide association study (GWAS) in the genetic risk score at once. The method is applied both in simulations and to Estonian Biobank (Eesti Geenivaramu, EGV) data to characterise its performance for different diseases and to compare it with simpler, previously used methods. The thesis also examines whether and how the predictive ability of genetic risk scores depends on which GWAS the SNP weights are taken from. It turned out that genetic risk scores for the same disease built from different GWASs need not be strongly correlated with one another, so the assessment of an individual's genetic predisposition depends on the score used and is therefore not uniquely determined. The behaviour of the distribution of genetic risk scores in different ethnic populations was studied as well. The distribution was found to depend on the genetic structure of the populations studied, so genetic predisposition cannot be determined from a genetic risk score without taking population structure into account. Finally, three well-known non-genetic risk scores and their predictive ability for cardiovascular disease were examined in the EGV data.
Two of the risk scores were well calibrated in the Estonian data, but the newest one, which has the most complex algorithm (QRISK2), underestimated the number of incident cases. It also emerged that, under the treatment guidelines accompanying these non-genetic scores, cholesterol-lowering medication to reduce cardiovascular risk should be recommended to nearly half of the middle-aged men and a quarter of the middle-aged women who took part in the study.

    The prices of genotyping and whole-genome sequencing have been decreasing rapidly over the past few years. As a result, genotypic data have become available in large quantities, allowing extensive investigation of the genetic background of many common complex diseases. The most studied genetic variants are single nucleotide polymorphisms (SNPs). Each SNP on its own tends to have a small effect on common complex diseases. However, by combining the effects of many SNPs into one variable, called a genetic risk score (GRS), one can compose a useful predictor of the genetic predisposition to a disease. This thesis introduces a new method called doubly-weighting, which allows many uncorrelated markers to be included instead of only a few genome-wide significant ones from a genome-wide association study (GWAS), and at the same time aims to correct for winner's curse bias. We illustrate its predictive ability under several scenarios, with both simulations and Estonian Biobank data, to show that it systematically performs better than simpler methods. The second article investigates how the choice of GWAS affects the predictive ability of GRSs for breast cancer. We also tried combining several GRSs into one metaGRS to obtain the most predictive genetic score, and we addressed the problem that different genetic risk scores with similar predictive ability for the same disease are not necessarily highly correlated.
Another important aspect influencing the predictive ability of GRSs is the similarity between the discovery dataset and the target dataset for which the GRS is intended. This is investigated in the third article, which shows that the distributions of GRSs depend heavily on the ancestral background of the population. In the fourth article, three known non-genetic risk scores for atherosclerotic cardiovascular disease (ASCVD) are validated in the Estonian Biobank data. Two of them were well calibrated, but the newest and most complicated algorithm, developed in the UK, predicted only about half as many cases as were observed. We also compared the statin treatment recommendations based on guideline-specific criteria and found that statins for primary prevention were recommended for almost half of the men and a quarter of the women under investigation, illustrating the high ASCVD risk levels in Estonia.
    https://www.ester.ee/record=b522905
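
In its simplest form, the genetic risk score described above is a weighted sum of risk-allele counts. The following is a minimal Python sketch of that idea, assuming per-SNP effect estimates (for example log odds ratios) from a GWAS and allele dosages coded from 0 to 2. The function name, the example values and the optional second weight are illustrative assumptions; how the thesis actually derives its second set of weights is not shown here.

```python
def genetic_risk_score(dosages, betas, extra_weights=None):
    """Weighted sum of allele dosages for one person.

    dosages:       risk-allele counts (0..2), one per SNP
    betas:         per-SNP effect estimates (e.g. log odds ratios) from a GWAS
    extra_weights: optional second per-SNP weight in [0, 1]; a hypothetical
                   input standing in for a doubly-weighted scheme
    """
    if extra_weights is None:
        # With all extra weights equal to 1 this is the ordinary
        # single-weighted score.
        extra_weights = [1.0] * len(betas)
    if not (len(dosages) == len(betas) == len(extra_weights)):
        raise ValueError("all inputs must have one entry per SNP")
    return sum(w * b * d for w, b, d in zip(extra_weights, betas, dosages))


# Made-up values for three SNPs (not taken from any study)
dosages = [0, 1, 2]
betas = [0.10, 0.05, -0.02]

single = genetic_risk_score(dosages, betas)
double = genetic_risk_score(dosages, betas, extra_weights=[1.0, 0.5, 0.0])
```

Setting an extra weight to 0 drops a SNP from the score entirely, while intermediate values shrink its contribution; that is the only sense in which the sketch is "doubly weighted".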

    Multivariate analysis for studying the behaviour of peptides

    When developing a tuberculosis vaccine, it is important to study which antigens react with the antibodies found in the vaccine. It is also important that the antibodies respond quickly and in the required quantity after vaccination. Typically, the behaviour of hundreds of different antigens, or peptides, is measured in a single patient at several time points after vaccination. The resulting data are therefore very large, and classical statistical methods are not always suitable for analysing them. This thesis introduces one methodology for analysing data with this structure and examines its applicability to a specific dataset. The first chapter recalls the multivariate normal distribution, the matrix normal distribution and the Wishart distribution, since they are used repeatedly in the later theory. The second chapter gives an overview of two simpler study designs: one in which two drugs are compared by measuring traits on the same or similar objects, and one in which drugs are compared with one another by measuring only one trait on the same objects. For both designs, the pair of hypotheses to be tested, the construction of the test statistic needed for the comparison, and its distribution are given. The second chapter prepares the reader for Chapter 5, which presents one possible generalisation of these two designs. The third chapter covers profile analysis, which is intended for comparing the mean vectors of different populations. It outlines the procedure of profile analysis, the hypotheses posed and how they are tested when comparing 2 populations. The fourth chapter introduces multivariate analysis of variance (MANOVA), which is needed for comparing means when more than 2 populations are compared. The assumptions required for MANOVA and the construction of the test statistic are described. The fifth chapter gives an overview of one possible way to analyse data from multivariate repeated measurements: the multivariate repeated-measures model with random effects.
The data are assumed to have a structure in which traits are measured on each individual at given time points and the individuals fall into groups to be compared. The chapter presents the form of the random-effects model, the estimates of the model parameters, the general form of hypotheses about the model parameters and the rule needed for testing them. The final chapter describes the biological background of the data used in the thesis and the process by which they arise. It also introduces the specific dataset used, applies profile analysis and attempts to fit the random-effects model presented in the preceding chapter. The aim of the profile analysis is to find out whether there are peptides for which the antibody concentration has not changed over time. The aim of fitting the random-effects model is to compare the shapes of the peptides' growth curves in order to find peptides that behave similarly.

    Validating the doubly weighted genetic risk score for the prediction of type 2 diabetes in the Lifelines and Estonian Biobank cohorts

    As many cases of type 2 diabetes (T2D) are likely to remain undiagnosed, better tools for early detection of high-risk individuals are needed to prevent or postpone the disease. We investigated the value of the doubly weighted genetic risk score (dwGRS) for the prediction of incident T2D in the Lifelines and Estonian Biobank (EstBB) cohorts. The dwGRS uses an additional weight for each single nucleotide polymorphism in the risk score to correct for "winner's curse" bias in the effect size estimates. The traditional single-weighted genetic risk score (swGRS) and the dwGRS were calculated for participants in Lifelines (n = 12,018) and EstBB (n = 34,129). The dwGRS was found to have a stronger association with incident T2D (hazard ratio [HR] = 1.26 [95% confidence interval: 1.10–1.43] and HR = 1.35 [1.28–1.42]) than the swGRS (HR = 1.21 [1.07–1.38] and HR = 1.25 [1.19–1.32]) in Lifelines and EstBB, respectively. Comparing the 5-year predicted risks from the models with and without the dwGRS, the continuous net reclassification index was 0.140 (0.034–0.243; p = 0.009) in Lifelines and 0.257 (0.194–0.319; p < 2 × 10⁻¹⁶) in EstBB. The dwGRS provided incremental value to the T2D prediction model with established phenotypic predictors. It clearly distinguished the risk groups for incident T2D in both biobanks, thereby showing its clinical relevance.
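
The continuous net reclassification index reported above summarises how often a new model moves predicted risks in the right direction: up for people who develop the disease, down for those who do not. The sketch below shows the basic calculation in Python under simplifying assumptions: outcomes are treated as plain binary labels and censoring is ignored, so it is probably not the estimator the study applied to its 5-year survival data. The function name and the example values are made up for illustration.

```python
def continuous_nri(risk_old, risk_new, events):
    """Continuous (category-free) NRI for binary outcomes.

    risk_old, risk_new: predicted risks from the two models, per person
    events:             1/True if the person had the event, else 0/False
    """
    up_e = down_e = n_e = 0  # counts among people with the event
    up_n = down_n = n_n = 0  # counts among people without the event
    for old, new, event in zip(risk_old, risk_new, events):
        if event:
            n_e += 1
            up_e += int(new > old)
            down_e += int(new < old)
        else:
            n_n += 1
            up_n += int(new > old)
            down_n += int(new < old)
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    # Net proportion moved up among events plus net proportion moved
    # down among non-events; the result lies between -2 and 2.
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Made-up example: four people, the first two had the event
risk_old = [0.10, 0.20, 0.30, 0.40]
risk_new = [0.15, 0.25, 0.20, 0.45]
events = [1, 1, 0, 0]
nri = continuous_nri(risk_old, risk_new, events)  # 1.0 for these values
```

In the toy data both people with the event are moved up (contributing 1.0), while one person without the event is moved down and one up (contributing 0.0).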

    Global Biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts

    Polygenic risk scores (PRSs) have been widely explored in precision medicine. However, few studies have thoroughly investigated their best practices in global populations across different diseases. Here we used data from the Global Biobank Meta-analysis Initiative (GBMI) to explore methodological considerations and PRS performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRSs using pruning and thresholding (P + T) and PRS-continuous shrinkage (CS). For both methods, using a European-based linkage disequilibrium (LD) reference panel resulted in comparable or higher prediction accuracy compared with several other non-European-based panels. PRS-CS overall outperformed the classic P + T method, especially for endpoints with higher SNP-based heritability. Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma, which has known variation in disease prevalence across populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using GBMI resources and highlight the importance of best practices for PRS in the biobank-scale genomics era.

    Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease

    Biobanks facilitate genome-wide association studies (GWASs), which have mapped genomic loci across a range of human diseases and traits. However, most biobanks are primarily composed of individuals of European ancestry. We introduce the Global Biobank Meta-analysis Initiative (GBMI), a collaborative network of 23 biobanks from 4 continents representing more than 2.2 million consented individuals with genetic data linked to electronic health records. GBMI meta-analyzes summary statistics from GWASs generated using harmonized genotypes and phenotypes from member biobanks for 14 exemplar diseases and endpoints. This strategy validates that GWASs conducted in diverse biobanks can be integrated despite heterogeneity in case definitions, recruitment strategies, and baseline characteristics. This collaborative effort improves GWAS power for diseases, benefits understudied diseases, and improves risk prediction while also enabling the nomination of disease genes and drug candidates by incorporating gene and protein expression data and providing insight into the underlying biology of human diseases and traits.
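
Meta-analysing GWAS summary statistics means combining, for each variant, the effect estimates reported by the individual biobanks. The abstract does not say which estimator is used, so the Python sketch below assumes the common fixed-effect inverse-variance weighted approach purely as an illustration; the function name and the numbers are made up.

```python
import math


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance weighted estimate for one variant.

    betas:           per-study effect estimates for the same variant
    standard_errors: matching per-study standard errors (all > 0)
    Returns (pooled_beta, pooled_se).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    # Each study is weighted by the inverse of its variance, so more
    # precise studies contribute more to the pooled estimate.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se


# Made-up example: two biobanks reporting the same variant
pooled_beta, pooled_se = inverse_variance_meta([0.10, 0.20], [0.1, 0.1])
```

With equal standard errors the pooled effect is the plain average of the two estimates, and the pooled standard error shrinks by a factor of the square root of two, which is the sense in which pooling biobanks increases power.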

    The trans-ancestral genomic architecture of glycemic traits

    Glycemic traits are used to diagnose and monitor type 2 diabetes and cardiometabolic health. To date, most genetic studies of glycemic traits have focused on individuals of European ancestry. Here we aggregated genome-wide association studies comprising up to 281,416 individuals without diabetes (30% non-European ancestry) for whom fasting glucose, 2-h glucose after an oral glucose challenge, glycated hemoglobin and fasting insulin data were available. Trans-ancestry and single-ancestry meta-analyses identified 242 loci (99 novel; P < 5 × 10⁻⁸), 80% of which had no significant evidence of between-ancestry heterogeneity. Analyses restricted to individuals of European ancestry with equivalent sample size would have led to 24 fewer new loci. Compared with single-ancestry analyses, equivalent-sized trans-ancestry fine-mapping reduced the number of estimated variants in 99% credible sets by a median of 37.5%. Genomic-feature, gene-expression and gene-set analyses revealed distinct biological signatures for each trait, highlighting different underlying biological pathways. Our results increase our understanding of diabetes pathophysiology by using trans-ancestry studies for improved power and resolution. A trans-ancestry meta-analysis of GWAS of glycemic traits in up to 281,416 individuals identifies 99 novel loci, of which one quarter was found due to the multi-ancestry approach, which also improves fine-mapping of credible variant sets. Peer reviewed

    Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps

    We expanded GWAS discovery for type 2 diabetes (T2D) by combining data from 898,130 European-descent individuals (9% cases), after imputation to high-density reference panels. With these data, we (i) extend the inventory of T2D-risk variants (243 loci, 135 newly implicated in T2D predisposition, comprising 403 distinct association signals); (ii) enrich discovery of lower-frequency risk alleles (80 index variants with minor allele frequency 2); (iii) substantially improve fine-mapping of causal variants (at 51 signals, one variant accounted for >80% posterior probability of association (PPA)); (iv) extend fine-mapping through integration of tissue-specific epigenomic information (islet regulatory annotations extend the number of variants with PPA >80% to 73); (v) highlight validated therapeutic targets (18 genes with associations attributable to coding variants); and (vi) demonstrate enhanced potential for clinical translation (genome-wide chip heritability explains 18% of T2D risk; individuals in the extremes of a T2D polygenic risk score differ more than ninefold in prevalence). Peer reviewed